Abstract

Rainfall plays a key role in hydrological applications and
agriculture in wet climatic regions. The lack of short-run
rainfall forecasting is considered a significant impediment to
scheduling root zone moisture preparation. Although
many mathematical techniques are available, basic
concerns remain unsolved, such as simplicity, high accuracy,
real-time use at many stations of a region, and the low
availability of inputs. In this study, nonlinear modeling
with the Gamma Test (GT) is presented to address some of
these problems. Seasonal and annual rainfall was forecast
for the north of Iran during 1956-2005, using four-year
lagged rainfall data together with geographical longitude,
latitude and elevation as input variables. The results
show that the Gamma Test is an effective tool for rainfall
forecasting. The applied nonlinear modeling techniques are
Local Linear Regression (LLR), Dynamic Local Linear
Regression (DLLR), and three separate Artificial Neural
Networks (ANN) trained with the Two Layer Back Propagation,
Broyden-Fletcher-Goldfarb-Shanno (BFGS), and
Conjugate Gradient algorithms. The training and
testing data were partitioned by random selection from the
original data set. Not only does the Gamma Test yield the
best input combination, but the models' good
performance also approaches the best achievable result. The
results demonstrate that models based on the Local
Linear Regression (LLR) technique perform better than the
ANN models. Among the ANN models, the one
based on the Two Layer Back Propagation training algorithm is
preferred because of its better performance compared with
the other ANN models.

Introduction

Predicting hydrological variables such as rainfall,
flood stream, and runoff flow as stochastic or
probabilistic events is one of the principal subjects in
water resource planning. Hydrological variables
are usually measured over time. Therefore, time
series analysis of their occurrences in discrete periods
is essential for monitoring and simulating the
hydrological behavior of a region. Forecasting rainfall,
the factor that most affects the hydrological cycle, is
vital in water resources management, irrigation
scheduling, and agricultural management, especially in
humid climates (Mimikou, 1983; Hamlin et al., 1987).
In wet and semi-wet climates, irrigation is not common
and farmers rely on rainfall to supply crop
water requirements. When rainfall is not enough,
supplemental irrigation is applied. Therefore,
forecasting, modeling and monitoring of rainfall are of
high importance in agricultural operations (Geng et al.,
1986; Hoogenboom, 2000; Sentelhas et al., 2001).

Notably, while weather forecasting deals with the daily
development of the weather up to several days ahead,
seasonal forecasting is concerned with the average
weather conditions on a timescale of a month to about a
year ahead. Seasonal forecasts are also known as long-run
weather forecasts or short-run climate forecasts
(Chang et al., 2003). Because seasonal forecasts give
information several months ahead, they can be used
by government, business, agriculture, and industry to
increase productivity, maximize economic benefits
and minimize losses. Specific examples of the
applications of seasonal forecasts are presented in
ECMWF (1999). Seasonal forecasts are based on slow
variations in the earth's boundary conditions (i.e. sea
surface temperature, soil wetness, and snow cover),
which can influence global atmospheric circulation and
rainfall (Fu et al., 2007; Rajeevan, 2007; Gonzalez
et al., 2009; Ousmane et al., 2011). A detailed
discussion of the differences between weather and
seasonal forecasting can be found in WMO (2002).

In recent decades, researchers have developed many empirical
methods in the form of statistical or analogue models,
which have a long history in seasonal forecasting (Bell, 1976;
Hui et al., 2000). Statistical methods, based on historical
observed data, try to build relationships between
predictors (e.g. sea surface temperature (SST),
atmospheric parameters) and predictands (e.g. rainfall
and temperature). Gilbert Walker used them in the
early 20th century to forecast Indian monsoon
rainfall (Allan et al., 1996). Analogue methods try to
find matches between past cases and the current case:
if the initial conditions are alike, the climate pattern
would evolve in much the same way (Chang et al.,
2003). Empirical models are easy to run and need
relatively little computational resources. Their major
disadvantage is that they try to predict complex
nonlinear atmosphere-ocean processes by linear
relationships. They use Markov models (singular value
decomposition), optimum climate normals, regression,
and canonical correlation analysis (Reason, 2001;
Gissila et al., 2004; Singhrattna et al., 2005; Frederiksen,
2006; Ousmane et al., 2011; Shamsnia et al., 2011).
Among the suggested techniques, the Markov chain has been
used the most (Caskey, 1963; Gates et al., 1976; Delleur
et al., 1978; Garbutt et al., 1981; Richardson et al., 1984;
Geng et al., 1986; Katz, 1977; Richardson, 2000),
though it is mostly applied to model
rainfall occurrence. Moreover, some
researchers combine this technique with
others, such as Gamma or exponential
distributions, to find the rainfall amount on rainy days
(Woolhiser et al., 1982; Sanchez-Cohen et al., 1997;
Aksoy, 2000; Fooladmand, 2006). Also, the Markov
chain is usually used for short timescales such as daily
data (Haan et al., 1976; Chin, 1977; Buishand, 1977;
Bruhn et al., 1980; Coe et al., 1982; Mimikou, 1983;
Woolhiser et al., 1986; Geng et al., 1986; Hanson et al.,
1990).

The other, more recent approach to seasonal forecasting
is dynamic modeling. Dynamic models
use prognostic physical equations: atmospheric
general circulation models, two-tiered coupled
ocean-atmosphere climate models (which first predict SST
and then climate), and fully coupled ocean-atmosphere-land-ice
general circulation models (CGCMs) that predict
ocean and atmosphere together (Ousmane et al., 2011).
Dynamic models try to predict the complex
atmosphere-ocean processes using the nonlinear
equations of mass conservation, motion, and energy.
They need enormous computer resources to run, but
can better simulate the physical processes and
therefore have the potential to produce more accurate
forecasts (Chang et al., 2003). The rapidly increasing
power and falling costs of computers have resulted in
a growing popularity of dynamic models.
For reviews of the global atmospheric models and their
performance, the reader is referred to Gates et
al. (1999); for the dynamic models, to Dalcher et
al. (1988), Latif et al. (1994), Trenberth (1997), Gadgil et
al. (1998), Anderson et al. (1999), Krishnamurti et al.
(1999), Derome (2001), Gadgil et al. (2005), Krishna
Kumar et al. (2005), Saha et al. (2006), Wang et al. (2005)
and Wang et al. (2009); and for comparisons of the
forecasting skill of empirical models
versus dynamic models, to Shukla et al. (2000), Wang et
al. (2001), Glantz (1998), and Anderson et al. (1999).

In recent decades, to simulate and model the behavior of
systems that are usually nonlinear,
multivariate, unknown, and noisy with high
uncertainty, researchers have used other
tools. Such tools, which are applicable to rainfall
forecasting, mostly include Artificial Neural Networks
(ANN), Fuzzy Inference Systems (FIS), Adaptive
Neuro-Fuzzy Inference Systems (ANFIS), and Artificial
Intelligence (AI) (French et al., 1992; Halff et al., 1993;
Ozelkan et al., 1996; Wong et al., 2003; Galambosi et al.,
1999; ASCE, 2000a, b; Sahai et al., 2000; Hadli et al.,
2002; Karamouz et al., 2004; Maria et al., 2005; Suwardi
et al., 2006; Kumarasiri et al., 2006; El-Shafie et al., 2007;
El-Shafie et al., 2008; El-Shafie et al., 2009;
Fallah-Ghalhary et al., 2009; El-Shafie et al., 2010a, b, c;
El-Shafie et al., 2011).

Despite plenty of studies on the prediction and
modeling of seasonal and annual rainfall with empirical,
statistical and dynamic models and with ANNs and FISs, the
application of nonlinear and nonparametric models
with lagged time series data has not received much
attention. Also, certain questions remain to be
answered, such as which lagged data are relevant for
building a reasonable model. These concerns can be
effectively tackled by a novel technique called the Gamma
Test (GT). The GT was first reported by Koncar (1997)
and Agalbjorn et al. (1997) and later improved and
discussed in detail by many other researchers
(Durrant, 2001; Tsui et al., 2002). The domain of a
possible model is restricted to the class of smooth
functions with bounded first partial derivatives. The basic
idea is distinct from earlier attempts at nonlinear
analysis. Before model construction, the Gamma Test
estimates the best mean-squared error
that can be achieved by any smooth model on unseen
data for a given selection of inputs. This technique can
be used to find the best embedding dimension and
data length for modeling a particular target
output. A formal mathematical justification of the
method can be found in Evans and Jones (2002).

Accordingly, the objective of the study reported here
is to apply the Gamma Test to identify
the parameters that affect seasonal and annual
rainfall. The study also uses the GT-derived input data
for nonlinear modeling of rainfall with Local Linear
Regression (LLR) and Artificial Neural Networks
(ANNs). The nonlinear models are evaluated
in training and validation phases after
model construction.

Method and Materials 

The Study Area and Used Data 

Mazandaran province, in the north of Iran, has a wet to
very wet climate (based on the De Martonne method) and
was selected as the study area. The region covers about
23,842 km². For the study, we used
monthly rainfall data collected from
four synoptic stations: Gorgan, Rasht,
Ramsar and Babolsar. Some meteorological and
geographical characteristics of these stations are
presented in TABLE 1. The average winter,
spring, summer, autumn and annual rainfall is
434, 255, 126, 198, and 1013 mm, respectively. The
rainfall time series run from 1956 to 2005, with a total of
204 monthly records after removing the missing data.
Meteorological data were obtained from the weather
database of the Meteorological Organization of Iran.
After selecting the 204 records, and in line
with the principal objective, the initial inputs that
influence the outputs were determined. The outputs of
the forecasting models were summer, spring and annual
rainfall. Forecasting seasonal rainfall, especially in
summer and spring, is very important in the north of Iran
because these seasons coincide with the growing season of
summer crops. During this period temperature and
crop evapotranspiration are high, and farmers need
to schedule the supply of crop water requirements. As
the farming year in Iran starts in October, the rainfall
data were first arranged on that basis. A
time series based on four-year monthly lagged data
was then prepared. Moreover, the average seasonal
and annual rainfall, elevation above sea level, and
geographical longitude and latitude were selected as
inputs. Because there were many input variables, all
analyses were carried out for two distinct input sets:
1) only the average seasonal and annual lagged rainfall
data were used as inputs, and 2) the average seasonal
and annual lagged rainfall together with monthly lagged
rainfall data were used as inputs. According to the
inputs and outputs, we proposed six models, as listed in
TABLE 2.



TABLE 1 GEOGRAPHICAL AND METEOROLOGICAL
CHARACTERISTICS OF FOUR SYNOPTIC STATIONS

Parameters                     Gorgan      Babolsar    Ramsar      Rasht
height (m)                     13.3        -20         -21         -6.9
longitude                      54°16' E    52°39' E    52°39' E    49°36' E
latitude                       36°51' N    36°43' N    36°43' N    37°15' N
mean monthly rainfall (mm)     51.6        72.5        101.8       113.4
mean seasonal rainfall (mm)    154.8       217.4       305.5       340.2
mean annual rainfall (mm)      619         870         1222        1361



TABLE 2 FORECASTING MODELS AND INITIAL INPUTS

Sets     Model No.   Output            Geographical data (inputs 1 to 3)   Combination inputs (rainfall)
First    Model I     Annual rainfall   Height, latitude and longitude      Seasonal and annual rainfall lagged for 4 years (inputs 4 to 25)
         Model II    Spring rainfall   Height, latitude and longitude      Seasonal and annual rainfall lagged for 4 years (inputs 4 to 25)
         Model III   Summer rainfall   Height, latitude and longitude      Seasonal and annual rainfall lagged for 4 years (inputs 4 to 25)
Second   Model IV    Annual rainfall   Height, latitude and longitude      Monthly, seasonal and annual rainfall lagged for 4 years (inputs 4 to 41)
         Model V     Spring rainfall   Height, latitude and longitude      Monthly, seasonal and annual rainfall lagged for 4 years (inputs 4 to 41)
         Model VI    Summer rainfall   Height, latitude and longitude      Monthly, seasonal and annual rainfall lagged for 4 years (inputs 4 to 41)



Time series analysis is complicated by the fact
that we usually do not know how far back in time
we should look to build a prediction model. This
initial decision is not irrevocable and should be guided
by commonsense analysis of what is
likely to be the case for the given data set and of how
many data are available. Preliminary analysis
showed that four-year lagged data yielded suitable
models, so we accepted this assumption and did
not use a longer lag period. Under this assumption,
and for the two selected data sets and six models,
we used the inputs presented in TABLE 2. The inputs
and outputs were normalized before analysis. The records
were also divided randomly into a training phase and a
validation phase, with 146 and 58 records,
respectively.
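
The data preparation described above can be illustrated with a minimal Python sketch. It builds four-year lagged inputs from a single annual series, rescales the columns, and splits 204 records at random into 146 training and 58 validation records. The function names, the min-max scaling to [0, 1], and the random seed are assumptions made for illustration; the study states only that the data were normalized and randomly divided.

```python
import numpy as np

def build_lagged_records(series, n_lags=4):
    """Pair each value with the values of the previous n_lags steps.

    Column k-1 of X holds the value lagged by k steps (hypothetical layout).
    """
    series = np.asarray(series, dtype=float)
    X = np.column_stack([series[n_lags - k: len(series) - k]
                         for k in range(1, n_lags + 1)])
    y = series[n_lags:]
    return X, y

def min_max_normalize(a):
    """Rescale each column to [0, 1] (assumed normalization scheme)."""
    a = np.asarray(a, dtype=float)
    lo, hi = a.min(axis=0), a.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (a - lo) / span

def random_split(X, y, n_train, seed=0):
    """Randomly divide the records into training and validation parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    train, valid = idx[:n_train], idx[n_train:]
    return X[train], y[train], X[valid], y[valid]
```

With 204 records, `random_split(X, y, n_train=146)` leaves 58 records for validation, matching the split reported in the text.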

Gamma test 

The trends of most climatological variables, such as
rainfall, are complex and involve nonlinear dynamic
systems that are usually unknown. Data-driven
modeling is therefore useful, especially
when the inner workings of the system are not well
understood. The Gamma test, as one such analytical
tool, helps to select input data before modeling (i.e.,
its result is independent of the models to be
developed). The Gamma test estimates the minimum mean
square error (MSE) that can be achieved when modeling
unseen data with any continuous nonlinear model
(Remesan et al., 2008). One reason the Gamma
test is so useful is that it can tell us
directly from the data whether we have sufficient data
to form a smooth nonlinear model and how good that
model is likely to be (Dunn et al., 2001). As explained
before, the Gamma test was first reported by
Koncar (1997) and Agalbjorn et al. (1997) and later
discussed in detail by many researchers (Durrant,
2001; Jones et al., 2002; Evans, 2002). In this research,
the WinGamma software, which was developed to carry
out the GT process, was used. Some
definitions used in the software and in the Gamma test
process are given as follows (Jones, 2001):

Model: The basic idea is quite distinct from earlier
attempts at nonlinear analysis. A smooth data
model is a differentiable function of the inputs
$x = (x_1, \ldots, x_m)$, which contain predictively useful
factors that can influence the output. It is assumed that
the data can be represented by an unknown model:

$$y = f(x_1, \ldots, x_m) + r \qquad (1)$$

where the input vectors $x_i \in \mathbb{R}^m$ are confined to
some closed bounded set $C \subset \mathbb{R}^m$ and, without loss of
generality, the corresponding outputs $y_i \in \mathbb{R}$ are
scalars, and $r$ is a random variable that represents noise.
Without loss of generality it can be assumed that the
mean of the noise distribution is zero and that the variance
of the noise Var(r) is bounded. The domain of a
possible model is now restricted to the class of smooth
functions which have bounded first partial derivatives.

Gamma Test: An algorithm to estimate the variance of
the noise Var(r) on each of the outputs, where this
variance is assumed to be bounded and independent of the
input values. For each choice of inputs, as the number of
data points increases, we try to establish the asymptotic
Gamma statistic for each output. Both the inputs and
outputs should be continuous real variables from
some bounded range. The underlying function
is presumed to be smooth, which means bounded first and
second derivatives. If the independence condition is
false, this is not necessarily fatal, and the Gamma test
will return an average noise variance over the whole
input space. The test can be used to show how the Gamma
statistic estimate varies as more data are used.
Eventually, if enough data are used, the Gamma
statistic should converge to the true noise variance on
the output for which it has been computed. The
Gamma test calculates the mean-squared p-th nearest
neighbour distances $\delta_M(p)$ $(1 \le p \le p_{max})$ and the
matching $\gamma_M(p)$. Although $f$ is an
unknown function, the Gamma test can directly estimate
Var(r) from the data:

$$\delta_M(p) = \frac{1}{M}\sum_{i=1}^{M} \left|x_{N[i,p]} - x_i\right|^2 \qquad (2)$$

where $x_{N[i,p]}$ is the p-th nearest neighbour of $x_i$, and $\gamma_M(p)$ is:

$$\gamma_M(p) = \frac{1}{2M}\sum_{i=1}^{M} \left|y_{N[i,p]} - y_i\right|^2 \quad (1 \le p \le p_{max}) \qquad (3)$$

Finally, a regression line is fitted through the points
$(\delta_M(p), \gamma_M(p))$, $1 \le p \le p_{max}$:

$$\gamma = A\delta + \Gamma \qquad (4)$$

The vertical intercept of the $(\delta(p), \gamma(p))$ regression line
is referred to as the "Gamma statistic", Γ. Effectively, Γ is the
limit of γ as δ → 0, which in theory is Var(r). The
gradient (A) is an index of model complexity: a
larger gradient represents a more complex
model (Jones, 2001).
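
As a hedged illustration of Eqs. (2) to (4), the following Python sketch computes the near-neighbour statistics by brute force and fits the regression line. It is not the WinGamma implementation: the function name `gamma_test`, the default `p_max=10`, and the use of a dense pairwise distance matrix (practical only for small data sets such as the one in this study) are assumptions.

```python
import numpy as np

def gamma_test(X, y, p_max=10):
    """Return (Gamma, A): intercept and gradient of the Gamma regression line."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    M = len(y)
    # Squared Euclidean distance between every pair of input vectors.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    # order[i, p-1] is N[i, p], the index of the p-th nearest neighbour of x_i.
    order = np.argsort(d2, axis=1)[:, :p_max]
    rows = np.arange(M)[:, None]
    delta = d2[rows, order].mean(axis=0)                       # Eq. (2)
    gamma = 0.5 * ((y[order] - y[:, None]) ** 2).mean(axis=0)  # Eq. (3)
    A, Gamma = np.polyfit(delta, gamma, 1)                     # Eq. (4)
    return float(Gamma), float(A)
```

For a smooth function sampled without noise the returned intercept should be close to zero; with additive noise it should approach the noise variance as the number of points grows.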

Near Neighbor: This records the index of the k-th
nearest neighbor, which has a settable bound in the
Gamma test. When estimating the Gamma statistic,
pmax should be selected in proportion to the size of the
data set. In general, in a Gamma test experiment, the
number of near neighbors should be kept below 30;
usually 10-20 is a good choice (Jones, 2001). The
near neighbor search finds the set of points nearest to
a query point.

M-Test: The M-test is a way to assess whether the
Gamma statistic estimates Var(r) reliably. It is
performed by computing the Gamma statistic for a
subset of the available data of size M. At each
successive calculation of the Gamma statistic, M is
increased by a small increment, until either
all the data have been used or the statistic has converged
sufficiently towards a fixed value.
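
The M-test loop can be sketched in Python as follows. This is an illustration only: `gamma_statistic` is a compact brute-force stand-in for the WinGamma computation, and the starting size, step, and `p_max` defaults are hypothetical choices, not values from the study.

```python
import numpy as np

def gamma_statistic(X, y, p_max=10):
    """Brute-force Gamma statistic (regression intercept) for 2-D inputs X."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    nn = np.argsort(d2, axis=1)[:, :p_max]
    delta = np.take_along_axis(d2, nn, axis=1).mean(axis=0)
    gamma = 0.5 * ((y[nn] - y[:, None]) ** 2).mean(axis=0)
    return float(np.polyfit(delta, gamma, 1)[1])

def m_test(X, y, start=50, step=25, p_max=10):
    """Gamma statistic on the first M records, for increasing M."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    sizes = range(start, len(y) + 1, step)
    return [(M, gamma_statistic(X[:M], y[:M], p_max)) for M in sizes]
```

Plotting the returned pairs gives the M-test graph; the estimate is taken as reliable once the curve flattens.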

Model Identification: This is used to select those 
inputs which can be best applied to predict a selected 
output (some inputs may be noisy or irrelevant). The 
most applicable model identification techniques are 
Full Embedding, Genetic Algorithm, Hill Climbing, 
Sequential Embedding and Increasing Embedding. 

Mean Squared Error (MSError): If y(i) (i = 1, 2, ..., M) is a
set of values of an output and y*(i) is a set of
predictions for y(i), then the MSError of the predictions
is:

$$\mathrm{MSError} = \frac{1}{M}\sum_{i=1}^{M} \left(y^*(i) - y(i)\right)^2 \qquad (5)$$



Standard Error (SE): The standard error of the regression
line is calculated as follows:

$$SE(\Gamma) = \sqrt{\frac{1}{n-2}\sum_{i=1}^{p_{max}} \left(\Gamma_i - \bar{\Gamma}\right)^2} \qquad (6)$$

where $\Gamma_i$ is the i-th Gamma regression point
value and $\bar{\Gamma}$ is its mean.

The Modeling Procedures 

In this study, the Gamma test was used to explore different
combinations of inputs and to assess their influence on
rainfall forecasting. There were many meaningful
combinations of inputs, from which the best one can
be determined by evaluating the Gamma value. This
value is a measure of the best estimate attainable
by any modeling method for unseen smooth
functions of continuous variables. Before modeling, we
divided the data into two parts: training data (70% of
the data) and testing data (30% of the data). Choosing
the set of inputs that has the minimum asymptotic Gamma
statistic for a particular output is known
as model identification. Based on the selected
inputs and output in the training period, rainfall
forecasting models were built with the WinGamma software
using: 1) static local linear regression, 2) dynamic
local linear regression, and 3) three different
neural network training algorithms: two layer back
propagation, conjugate gradient descent and BFGS.
Predictions on new input data for which the outputs
are unknown can also be made using the best
identified model.

Local linear regression models are fast to make. These 
models can also be easily updated as new training 
data becomes available, which is not the case with 
neural networks. Indeed WinGamma also offers a 
dynamic local linear regression option which is exactly 
local linear regression with dynamic updating. This 
choice is useful for time series prediction and then it is 
not used in this research. Neural network models take
time to build, but in parts of the input space where
data are sparse they generalize better than local
linear regression. Neural networks can also predict
much faster than local linear regression
based algorithms, so for some applications it is well
worth the time and effort to set up a neural model.

Local linear regression: Local Linear Regression (LLR) 
can produce accurate predictions in regions of high 
data density in input space, but it is liable to produce 
unreliable results for non-linear functions in regions of 
low data density. 
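
A minimal Python sketch of a local linear regression prediction is given below, assuming Euclidean distances and an ordinary least-squares fit on the k nearest training points. The function name `llr_predict` and the default `k=15` are illustrative assumptions; WinGamma's internal implementation may differ.

```python
import numpy as np

def llr_predict(X_train, y_train, x_query, k=15):
    """Predict y at x_query from a linear fit to its k nearest neighbours."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    # Find the k training points closest to the query point.
    d2 = ((X_train - x_query) ** 2).sum(axis=1)
    nn = np.argsort(d2)[:k]
    # Fit y = c0 + c . x on those neighbours only (least squares).
    design = np.column_stack([np.ones(len(nn)), X_train[nn]])
    coef, *_ = np.linalg.lstsq(design, y_train[nn], rcond=None)
    return float(coef[0] + x_query @ coef[1:])
```

Because the fit uses only nearby points, the prediction is exact for locally linear data and degrades where neighbours are far from the query, which is the low-density weakness noted above.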

Dynamic local linear regression: This is basically
identical to LLR, with the extra feature that new
data are incorporated into the model as they are seen
for the first time. Its effect can be seen by starting the
model with little training data and running a test on
many data. As new test data are encountered, dynamic
LLR makes steadily better predictions. This
method is mainly applicable to time series
analysis (Jones, 2004).

Two layer back propagation: This technique uses the 
standard back propagation algorithm to produce a 
two-layer feed forward neural network. With all the 
neural network training algorithms, one should note 
the choice to recalculate the target MSError. This is 
useful if a part of the data for training and testing has 
been altered. Two layer back propagation also needs: 
a) the initial learning rate with positive value that 
controls the first step size in weight adjustment, b) 
Momentum constant which is positive, and controls 
the extent to which the size and direction of the 
current step in weight space is influenced by the size 
and direction of the previous step, and c) 
Regularization constant that is positive, and limits the 
size of weights. 
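
The roles of the three constants can be seen in the following Python sketch of batch back propagation with momentum and weight decay. It assumes that "two-layer" means one tanh hidden layer plus a linear output layer, and the function names and default values are hypothetical; it does not reproduce the WinGamma routine.

```python
import numpy as np

def train_two_layer(X, y, n_hidden=5, lr=0.1, momentum=0.9, reg=1e-4,
                    epochs=2000, seed=0):
    """Train a one-hidden-layer network by gradient descent with momentum."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W1 = rng.normal(0.0, 0.5, (m, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, n_hidden)
    b2 = 0.0
    vel = [np.zeros_like(W1), np.zeros_like(b1), np.zeros_like(W2), 0.0]
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)      # hidden layer activations
        err = H @ W2 + b2 - y         # output error
        # Gradients of half the mean squared error plus (reg/2)*|weights|^2.
        gW2 = H.T @ err / n + reg * W2
        gb2 = err.mean()
        dH = np.outer(err, W2) * (1.0 - H ** 2)
        gW1 = X.T @ dH / n + reg * W1
        gb1 = dH.mean(axis=0)
        # Momentum blends the previous step into the current one.
        vel = [momentum * v - lr * g
               for v, g in zip(vel, [gW1, gb1, gW2, gb2])]
        W1, b1, W2, b2 = W1 + vel[0], b1 + vel[1], W2 + vel[2], b2 + vel[3]
    return W1, b1, W2, b2

def predict(params, X):
    """Evaluate the trained network on inputs X."""
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2
```

Here `lr` sets the step size, `momentum` carries over a fraction of the previous step, and `reg` penalizes large weights.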

Conjugate gradient descent: This is a variation of, and an
improvement on, two-layer vanilla back propagation.
It is more effective but requires more memory. The
set-up procedure is similar.

BFGS neural network: The BFGS neural network training
algorithm is a quasi-Newton method performed
iteratively using successively improved estimates of
the inverse Hessian. It provides progressive
adjustment of the neural network weights along a
descent direction (Fletcher, 1987). Probably the fastest and
most efficient neural network training algorithm
offered by WinGamma is a modified version of the
Broyden-Fletcher-Goldfarb-Shanno learning algorithm.
This algorithm uses second differences and is
sometimes degraded by very noisy data, but it is generally
appropriate to try this option first when
producing a neural model. Feed forward
networks with as few as one hidden layer are known to act as
universal approximators for continuous functions over
a compact set (Cybenko, 1989; Hornik et al., 1989).
Details of such modeling for chaotic systems can be
found in Jones et al. (2002), Tsui et al. (2002), and
Evans et al. (2002).

Model Selection Criteria 

To evaluate the constructed models, we used three
reference statistics: the Correlation Coefficient (R),
the Root Mean Squared Error (RMSE) and the Mean Biased
Error (MBE).

If y and y* denote the observations and
predictions, respectively, each with M
data points, and $\bar{y}$ and $\bar{y}^*$ denote their means,
the RMSE equation for the standard error is:

$$RMSE = \sqrt{\frac{1}{M-2}\left[\sum (y^* - \bar{y}^*)^2 - \frac{\left[\sum (y - \bar{y})(y^* - \bar{y}^*)\right]^2}{\sum (y - \bar{y})^2}\right]} \qquad (7)$$

the equation for the Pearson product moment
correlation coefficient, R, is:

$$R = \frac{\sum (y - \bar{y})(y^* - \bar{y}^*)}{\sqrt{\sum (y - \bar{y})^2 \sum (y^* - \bar{y}^*)^2}} \qquad (8)$$

and the mean biased error, MBE, is:

$$MBE = \frac{1}{M}\sum (y^* - y) \qquad (9)$$
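
The three statistics can be sketched in Python as below. The sketch assumes that Eq. (7) is the standard error of the regression of predictions on observations (with M - 2 degrees of freedom), that R is the Pearson coefficient, and that MBE is the mean of prediction minus observation; the function names are illustrative.

```python
import numpy as np

def correlation(y, y_pred):
    """Pearson product moment correlation between observations and predictions."""
    dy, dp = y - y.mean(), y_pred - y_pred.mean()
    return float((dy * dp).sum() / np.sqrt((dy ** 2).sum() * (dp ** 2).sum()))

def standard_error(y, y_pred):
    """Standard error of predictions regressed on observations (assumed Eq. 7)."""
    M = len(y)
    dy, dp = y - y.mean(), y_pred - y_pred.mean()
    resid = (dp ** 2).sum() - (dy * dp).sum() ** 2 / (dy ** 2).sum()
    return float(np.sqrt(max(resid, 0.0) / (M - 2)))  # guard against round-off

def mean_bias_error(y, y_pred):
    """Mean of (prediction - observation); positive means over-prediction."""
    return float((y_pred - y).mean())
```

Note that a constant offset between predictions and observations leaves R and the standard error unchanged and shows up only in the MBE, which is why the three statistics are reported together.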



Results and Discussion

Data Analysis and Model Input Selection Using the 
Gamma Test 

The GT estimates the minimum mean square error
(MSE) that can be achieved when modeling unseen
data with any continuous nonlinear model. As
mentioned, discovering the parameters that affect
annual and seasonal rainfall is difficult and
time-consuming. The Gamma test can therefore extract
much vital information from rainfall data with different
lags. The GT provides guidance on the input data
before a model is developed and greatly reduces
the construction time of the model. First, we loaded all
the data into WinGamma, considered the rainfall time
series, and tried to find the best embedding (i.e. the
embedding with Γ closest to zero). Before
selecting the best embedding, however, the number of
near neighbors and the number of inputs must be determined.
The measurement noise and sampling rate of the data are the
basis for choosing the number of near neighbors in the Gamma
test. If the data are noisy, this adjustable factor should
be larger to get a reliable Gamma value. A high
measurement sampling rate also calls for many near
neighbors. However, if the measurement sampling
rate is low, too many near neighbors will blur the
Gamma value. A compromise is needed to select a
suitable number of near neighbors, so that the Gamma
value is relatively reliable and close to its true value.
We tested different numbers of near neighbors and selected a
suitable value of pmax for each data set.
The selected values were 16 for the annual rainfall models
I and IV, 13 for the spring rainfall model II, 10 for
the spring rainfall model V and the summer
rainfall model VI, and 20 for the summer rainfall
model III (TABLE 3).

A key practical question is how much data we need to
get an accurate estimate of Gamma, and subsequently to
build a model that predicts with suitable
accuracy. To answer this question, we run the Gamma
test for increasing M and plot the Gamma
values against M. Typically,
for small M the graph shows much
variability, but as M increases the graph stabilizes
to an asymptote that reflects the true value of the
noise variance. Once the graph has stabilized, there is
nothing more to gain by using larger M, and this M is
taken as the maximum number of points used in the near
neighbor selection. Therefore, the quantity of data
was analyzed using the M-test to select sufficient
data for an asymptotic Gamma estimate and
subsequently a reliable model. The results showed that
the data were sufficient at around M=198 data points, so
all the data were used for selecting inputs. Moreover,
the available data are reasonably suitable for forecasting
annual and seasonal rainfall. At these values
the graph stabilizes, so we can have some
confidence that our estimate is reasonably accurate
(Fig. 1 and TABLE 3).




[Plot; horizontal axis: Unique Data Points, 10 to 190]

FIG. 1 M-TEST FOUND FOR DATA SETS OF MODEL IV

Combining the Gamma regression line,
scatter plot, and M-test provides an estimate of
Var(r) together with a qualitative degree of confidence.
This combination also indicates the best
MSError attainable when modeling unseen smooth
functions of continuous variables. The best MSError level
varies markedly with
different input combinations in all models. Since a
single Gamma test is a relatively fast procedure, it is
possible to find the selection of inputs that
minimizes the asymptotic value of the Gamma statistic;
this is the 'best selection' of inputs. Thus, the
candidate inputs were assessed by the Gamma test and
classified into two categories: effective and
non-effective. When the inputs are many (41 inputs
for annual rainfall model IV), the number of possible
combinations is too large (2^41 - 1 = 2,199,023,255,551
combinations), so running the Gamma test for all
the combinations is impractical. To resolve this, we used
three shortcut approaches to model identification for
selecting the best inputs: Genetic
Algorithm, Hill Climbing and Sequential Embedding.
These methods produced different combinations of inputs
with the lowest Gamma statistic and MSError level. In this
study, the Genetic Algorithm was used most often for the
best selection. We examined the possible
embeddings. The minimum value of Γ was
observed with different lagged input
data sets, and the best embeddings are presented in
TABLE 3. The gradient (A) is considered an
indicator of model complexity (a larger gradient
indicates a more complex model). A data model with
low MSE and low gradient can be considered the
best scenario for modeling. The V-Ratio measures the
degree of predictability of the given outputs using
the available inputs. The smaller value of V-Ratio was
observed when we considered all the inputs. The
various combinations of lagged rainfall
data influence the outputs and can make a good model
without applying all the inputs. Notably, the data
lagged by up to 4 years have a strong effect on the output in
all models.
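
The size of the search space and the reading of a selection mask can be checked with a short Python sketch. The function names are hypothetical, and the convention that the leftmost mask digit corresponds to input 1 is an assumption.

```python
def count_input_combinations(n_inputs):
    """Number of non-empty subsets of n_inputs candidate inputs."""
    return 2 ** n_inputs - 1

def selected_inputs(mask):
    """List the 1-based input numbers switched on in a 0/1 mask string."""
    bits = mask.replace(" ", "")
    return [i + 1 for i, bit in enumerate(bits) if bit == "1"]
```

For example, `count_input_combinations(41)` returns 2199023255551, the figure quoted in the text, which is why heuristic searches such as the Genetic Algorithm are used instead of exhaustive evaluation.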

Nonlinear Model Construction and Testing 

After selecting the "optimal" inputs with the Gamma test,
we built predictive models for the six sets of outputs and
performed the usual analysis. As the model
identification process is extensive, only a summary of its
implications is given. Two types of models were constructed:
1) LLR models and 2) ANN models. The nonparametric
procedure based on LLR models does not need training
in the same way as neural network models. Nevertheless, we
randomly divided the data set into two parts: training and
validation. For constructing the LLR models, the optimal
number of near neighbors, which principally depends on the
noise level, was determined by trial and error.

TABLE 3 THE GAMMA TEST AND THE BEST SELECTIVE MASKS
AND THEIR PERFORMANCE CRITERIA FOR FORECASTING
FUNCTION IN DIFFERENT MODELS (INCLUSION AND
EXCLUSION INDICATED BY A 1 OR 0 IN THE MASK
RESPECTIVELY)

Parameters             Model I    Model II   Model III  Model IV   Model V    Model VI
Output                 Annual     Spring     Summer     Annual     Spring     Summer
Gamma (Γ)              5.88e-6    0.0627     0.0738     4.14e-7    5.21e-5    4.09e-6
Gradient (A)           0.0234     0.0451     0.0262     0.0117     0.0385     0.0337
MSError                0.0041     0.0142     0.0118     0.0048     0.0155     0.0120
Coef. of Deter. (R²)   1          0.7488     0.7046     1          0.9998     1
V-Ratio                2.35e-5    0.2512     0.2954     1.65e-6    0.0002     1.64e-5
Neighborhood values    16         13         20         16         10         10
M values               198        198        198        198        198        198

Selected mask
Model I      1111001 1101100 1110101 1111
Model II     1011011 1011101 1110111 0010
Model III    01010111 01110011 11101111
Model IV     1101000 0111100 0110101 1101101 0011011 linn
Model V      0110111 0100001 0101111 1100110 1101111 110101
Model VI     11101100 11011010 01110000 01111101 11010011



A proper number of near neighbors was 13 to 15 for 
LLR models. The performance of the LLR models was 
compared to that of the models based on the neural 
network technique. Several general statistics were 
applied to select the best models and to compare the 
results of the LLR and neural network models, 
namely correlation, root mean 
squared error (RMSE) and mean bias error (MBE). The 
details of the modeling statistics are given in TABLE 4 
for the validation phase. In this study, besides the three 
different ANN models, we constructed ANN models 
trained with various numbers of hidden layer 
neurons and selected the best 
number of hidden layer neurons; their 
performance was compared to that of the other models 
(TABLE 4). 
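
The three comparison statistics named above can be computed as in the short sketch below. The function names are illustrative, and the sign convention for MBE (predicted minus observed, so that a positive value means over-prediction) is an assumption, since the text does not state which convention was used.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def mbe(observed, predicted):
    """Mean bias error, here taken as mean(predicted - observed)."""
    n = len(observed)
    return sum(p - o for o, p in zip(observed, predicted)) / n


def correlation(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    return cov / (so * sp)


# Hypothetical values, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [2.0, 3.0, 4.0, 5.0]
print(correlation(obs, pred), rmse(obs, pred), mbe(obs, pred))
```

A model can have a high correlation and still be biased, as in the hypothetical values above, which is why RMSE and MBE are reported alongside the correlation.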

Forecasting rainfall using the LLR model resulted in 
the best statistics. For all presented models, the 
LLR model performed better than 
the BFGS, conjugate gradient and two layer 
back propagation ANN models in both training and 
validation. TABLE 4 shows that the models 
based on ANN struggle to produce 
high quality performance. Among the ANN models, 
the results also indicate that the predictive 
capability of the model based on the two layer 
back propagation neural network is better than that of 
the BFGS and conjugate gradient networks 
for all the mentioned statistics. The comparative 
analysis of these models using the mentioned basic 
statistics was carried out for both training and 
validation; the results for the validation period are 
shown in TABLE 4. 

Moreover, the results for 
forecasting annual and spring rainfall are better 
when the input variables are 
combinations of monthly and seasonal data than when only 
seasonal rainfall data are used. This does not hold for 
modeling summer rainfall. Also, the best 
Gamma test results were obtained for forecasting annual 
rainfall, especially when the input variables are 
combinations of monthly and seasonal rainfall, namely 
Model IV. 

We present more complete results of the 
Local Linear Regression models graphically in FIGURES 2 and 3. 
FIGURE 2 shows scatter plots of computed and 
observed annual (Model I and Model IV), spring 
(Model II and V) and summer (Model III and VI) 
rainfall during the training and validation phases. 
FIGURE 3 is a close-up view comparing the actual 
annual rainfall with the forecasts of the Local Linear 
Regression model on a subset of 100 points of the 
test data. The applied inputs include 
height, latitude and longitude, and monthly, seasonal 
and annual rainfall lagged by up to 4 years. The 
LLR models, which use different combinations 
of inputs, clearly work well in forecasting. 



TABLE 4 COMPARISON OF THE GENERAL STATISTICS VALUES OF 
VALIDATION FOR THE SELECTED MASKS BASED ON TABLE 3 

FIG. 2 A COMPARISON OF THE ACTUAL RAINFALL DATA 
AND PREDICTION OUTPUT BASED ON LLR MODEL (A TO F) 



Conclusion 

Rainfall forecasting plays an important role in water 
resource, agriculture and environment management. 
We have investigated prediction models that are 
simple, applicable and accurate, and that need only 
readily available data. Therefore, models based 
only on lagged monthly rainfall and its temporal 
combinations were constructed and 
tested. 

In application areas such as meteorological modeling, 
where the underlying processes are highly uncertain 
and conjectural, applying the Gamma test 
to the selection of relevant variables in the 
construction of nonlinear models is a useful technique. 
In this study, we have illustrated how the Gamma test 
can be combined with nonlinear techniques in 
the construction of non-parametric smooth models for 
forecasting rainfall. The approach predicts 
rainfall in the north of Iran using only 
monthly rainfall data lagged by up to four years and the 
geographical longitude, latitude and elevation of every 
station. The selection of input variables was 
analyzed by considering the effects of different input 
combinations on the general statistics of the Gamma 
test. The quantity of data needed to construct proper 
models for forecasting annual, spring and summer 
rainfall was determined using the M-test in WinGamma, 
which identified it as 198. 

Also, we have demonstrated the use of nonlinear 
modeling methods such as Local Linear Regression 
(LLR) and ANNs trained with the BFGS, 
conjugate gradient and two layer 
back propagation training algorithms 
in modeling annual, spring and summer rainfall. 

The LLR models performed reasonably well in comparison 
with the ANN training algorithms in validation. 
Moreover, among the ANN models, the two layer back 
propagation training algorithm is to be preferred because of its 
better performance compared to the other ANN models. 
Overall, the LLR technique was able to 
provide more reliable estimations than the ANN 
models. It would be interesting to explore 
whether similar results can be reproduced in 
other regions in the future. 

FIG. 3 A CLOSER INSPECTION OF THE LLR MODEL PERFORMANCE ON THE (RANDOMIZED) UNSEEN DATA 
SHOWS AN ACCEPTABLE ERROR LEVEL. BLACK - MODEL PREDICTION, BLUE - ACTUAL DATA, RED - ERROR 